Distributed Data Mining in the Grid Environment

Authors

  • Dr. P. Alli
  • C. B. SelvaLakshmi
  • S. Murali
Abstract

Grid computing has emerged as an important new branch of distributed computing focused on large-scale resource sharing and high-performance orientation. Many applications require the analysis of very large data sets. The data are often large and geographically distributed, and their complexity is increasing. In these areas, grid technologies provide effective computational support for applications such as knowledge discovery. This paper is an introduction to Grid infrastructure and its potential for machine learning tasks.


Similar resources

Workflow-based Tasks Scheduling on Grid

The distributed nature of data and the need for high performance make the Grid a suitable environment for distributed data mining (DDM). Since DDM applications are typically data intensive, one of the main requirements of such a DDM Grid environment is efficient workflow scheduling. We propose an architecture for a Knowledge Grid scheduler that results in the minimal r...


A Grid Data Mining Architecture for Learning Classifier Systems

Recently, there has been growing interest among researchers and software developers in exploring Learning Classifier Systems (LCS) implemented in parallel and distributed grid structures for data mining, due to their practical applications. The paper highlights some aspects of LCS and studies the competitive data mining model with homogeneous data. In order to establish more efficient dist...


A Survey of Dynamic Replication Strategies for Improving Response Time in Data Grid Environment

Large-scale data management is a critical problem in distributed systems such as the cloud, P2P systems, the World Wide Web (WWW), and Data Grids. One effective solution is data replication, which efficiently reduces the cost of communication and improves data reliability and response time. Various replication methods can be proposed depending on when, where, and how replicas are gener...


Distributed data mining in grid computing environments

Computing-intensive data mining for inherently Internet-wide distributed data, referred to as Distributed Data Mining (DDM), calls for the support of a powerful Grid with an effective scheduling framework. DDM often shares the computing paradigm of local processing and global synthesizing. It involves every phase of the Data Mining (DM) process, which makes the workflow of DDM very complex and c...


Applying Grid Technologies to Distributed Data Mining

The Grid promises improvements in the effectiveness with which global businesses are managed, if it enables distributed expertise to be efficiently applied to the analysis of distributed data. We report an ESRC-funded collaboration between EPCC in Edinburgh and Curtin University of Technology in Perth, Australia, that is applying public-domain Grid technologies to secure data mining within a co...


A New Job Scheduling in Data Grid Environment Based on Data and Computational Resource Availability

A Data Grid is an infrastructure that controls huge amounts of data files and provides intensive computational resources across geographically distributed collaborations. The heterogeneity and geographic dispersion of grid resources and applications pose complex problems such as job scheduling. Most existing scheduling algorithms in Grids focus on only one kind of Grid job, which can be data...




Publication date: 2012